Skip to content

Conversation

vinothkumarr227
Copy link
Contributor

Fixes: #7049

RELEASE NOTES: None

Copy link

codecov bot commented Sep 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.11%. Comparing base (7235bb7) to head (f1c77ee).
⚠️ Report is 23 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8602      +/-   ##
==========================================
+ Coverage   81.86%   82.11%   +0.24%     
==========================================
  Files         415      415              
  Lines       40694    40711      +17     
==========================================
+ Hits        33316    33429     +113     
+ Misses       5993     5900      -93     
+ Partials     1385     1382       -3     

see 40 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines 189 to 193
"",
"unix://a/b/c",
"unix://authority",
"unix-abstract://authority/a/b/c",
"unix-abstract://authority",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not remove these test cases. Since NewClient doesn't connect like Dial, we need to do the following to get an error instead:

  1. Call cc.Connect()
  2. Wait for TRANSIENT_FAILURE state using AwaitState. If using this function causes a circular dependency, copy it over to this file, adding a comment that mentioning the reason for the duplication.
  3. If context times out before TF state, fail the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using Idle because these malformed targets result in zero resolved addresses. The client never attempts a connection, so it stays IDLE and cannot reach TRANSIENT_FAILURE

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an INFO log similar to the following from the unix resolver:

INFO clientconn.go:333 [core] [Channel #5] Channel failed to exit idle mode: invalid (non-empty) authority: authority (t=+1.01919ms)

Since this is the only indication that the connection attempt failed (no errors are returned), I think this should be logged as an ERROR. Presently, the method responsible for channelz trace event logging is logging everything at as an INFO log. We could have at accept the severity as an additional param to achieve this. Once this is an error log, we can assert that the error is logged similar to the following.

grpctest.ExpectError("Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII \"too_many_pings\"")

@dfawley do you think we should log the failure to exit idle mode in cc.Connect() as an ERROR instead of INFO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two options:

  1. Change the unix name resolver not to error from Build when this happens. Instead, have it produce a ResolverError and otherwise do nothing.

  2. Handle resolver.Build errors so that all name resolvers that error during Build will instead effectively produce a ResolverError. This fixes that class of problem in the same way for all name resolvers.

Either way, RPCs will fail with that error message text, and we shouldn't need anything besides INFO (or even V2?) logs.

@arjan-bal arjan-bal added the Type: Internal Cleanup Refactors, etc label Sep 25, 2025
@arjan-bal arjan-bal added this to the 1.77 Release milestone Sep 25, 2025
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
cc, err := NewClient(target, WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are changing the check condition here , is it intentional? If it is , the error message isn't correct. We are checking err!=nil which means NewClient failed and the error message says NewClient succeeded which is not right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

cc, err := Dial(test.target, WithTransportCredentials(insecure.NewCredentials()), WithContextDialer(dialer))
cc, err := NewClient(test.target, WithTransportCredentials(insecure.NewCredentials()), withDefaultScheme(defScheme), WithContextDialer(dialer))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should override the default scheme to passthrough with new client because we want to check the behavior of parsed strings with defualt NewClient function with custom dialer. And the behavior might be different , because dns resolver also does its own parsing , so the results might be different. Correct me if I have understood incorrectly. But I think we should have a separate test for NewClient's default parsed address behavior with custom dialer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback, I used withDefaultScheme only in the specific tests where it’s needed, as suggested by Arjan. For checking the default NewClient behavior with a custom dialer, I’m not overriding the scheme. If you’d like a separate test specifically for NewClient's default behavior, let me know and I can add it.

if err != nil {
t.Fatalf("Dial failed. Err: %v", err)
}
client.Connect()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we add this here? We did not change from Dial to NewClient in the test and Dial will connect automatically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change, I reverted it back.

Comment on lines 189 to 193
"",
"unix://a/b/c",
"unix://authority",
"unix-abstract://authority/a/b/c",
"unix-abstract://authority",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an INFO log similar to the following from the unix resolver:

INFO clientconn.go:333 [core] [Channel #5] Channel failed to exit idle mode: invalid (non-empty) authority: authority (t=+1.01919ms)

Since this is the only indication that the connection attempt failed (no errors are returned), I think this should be logged as an ERROR. Presently, the method responsible for channelz trace event logging is logging everything at as an INFO log. We could have at accept the severity as an additional param to achieve this. Once this is an error log, we can assert that the error is logged similar to the following.

grpctest.ExpectError("Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII \"too_many_pings\"")

@dfawley do you think we should log the failure to exit idle mode in cc.Connect() as an ERROR instead of INFO.

}
defer cc.Close()
cc.Connect()
testutils.AwaitState(ctx, t, cc, connectivity.Idle)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much a no-op since the channel starts in IDLE. A better check would be to assert testutils.AwaitState with a short context. We should wait for a reply on my other comment though, it may change the assertion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I'm still a little unclear on what I should do for this test right now.

resolver_test.go Outdated
}
cc.Connect()
if got, want := <-addrCh, target; got != want {
if got, want := <-addrCh, "caseTest2:///localhost:1234"; got != want {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert this change? It seems to be a no-op since the target variable has the same value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@arjan-bal arjan-bal assigned dfawley and vinothkumarr227 and unassigned arjan-bal Oct 6, 2025
@dfawley dfawley removed their assignment Oct 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update docs and examples and tests to use NewClient instead of Dial
5 participants